Finding phonemes: improving machine lip-reading
نویسندگان
چکیده
In machine lip-reading there is continued debate and research around the correct classes to be used for recognition. In this paper we use a structured approach for devising speaker-dependent viseme classes, which enables the creation of a set of phoneme-to-viseme maps where each has a different quantity of visemes ranging from two to 45. Viseme classes are based upon the mapping of articulated phonemes, which have been confused during phoneme recognition, into viseme groups. Using these maps, with the LiLIR dataset, we show the effect of changing the viseme map size in speaker-dependent machine lip-reading, measured by word recognition correctness and so demonstrate that word recognition with phoneme classifiers is not just possible, but often better than word recognition with viseme classifiers. Furthermore, there are intermediate units between visemes and phonemes which are better still.
منابع مشابه
Decoding visemes: improving machine lipreading (PhD thesis)
This thesis is about improving machine lip-reading, that is, the classification of speech from only visual cues of a speaker. Machine lip-reading is a niche research problem in both areas of speech processing and computer vision. Current challenges for machine lip-reading fall into two groups: the content of the video, such as the rate at which a person is speaking or; the parameters of the vid...
متن کاملDesigning and implementing a system for Automatic recognition of Persian letters by Lip-reading using image processing methods
For many years, speech has been the most natural and efficient means of information exchange for human beings. With the advancement of technology and the prevalence of computer usage, the design and production of speech recognition systems have been considered by researchers. Among this, lip-reading techniques encountered with many challenges for speech recognition, that one of the challenges b...
متن کامللبخوانی و ادراک گفتار دانشآموزان کمشنوای مدارس ویژۀ کمشنوایان در شهر تهران
Objective: The goal of this study was to evaluate the lip reading ability and Speech perception of hearing impaired students of special schools for the hearing impaired in different speech levels. Materials & Methods: In this cross- sectional study, 44 deaf students (9-12 years old) were selected with multi-stage cluster sampling method, from two special schools for the deaf in Tehran. Tools...
متن کاملA model for the dynamics of articulatory lip movements
The present work is part of a framework to design and implement a language laboratory for speech reading/lip reading for multiple languages. It is based on the interdisciplinary project LIPPS at Technical University of Berlin, Germany, which aims to develop a training-aid for speech reading by employing a text-driven facial animation from a single passport photo with the help of 2D image morphi...
متن کاملTo build a model for implementing automated lip reading which involves Lip motion feature to text conversion
A speech recognition system has three major components: feature extraction, probabilistic modelling of features and classification. In literature, the general approach is to extract the principle components of the lip movement in terms of the lip shape based properties in order to establish a one-to-one correspondence between phonemes of speech and visemes of lip shape. Several modelling and cl...
متن کامل